85 research outputs found

Learning mutational graphs of individual tumour evolution from single-cell and multi-region sequencing data

Author: Alessandro Tanca (494538)
Antonio Palomba (494539)
Cristina Fraumene (374294)
Edoardo Fiorillo (518797)
Francesco Cucca (145742)
Marcello Abbondio (3706183)
Sergio Uzzau (186221)
Valeria Manghina (3498188)
Publication venue
Publication date: 22/03/2019
Field of study

Background. A large number of algorithms are being developed to reconstruct evolutionary models of individual tumours from genome sequencing data. Most methods can analyze multiple samples collected either through bulk multi-region sequencing experiments or through the sequencing of individual cancer cells. However, the same method can rarely support both data types. Results. We introduce TRaIT, a computational framework to infer mutational graphs that model the accumulation of multiple types of somatic alterations driving tumour evolution. Compared to other tools, TRaIT supports multi-region and single-cell sequencing data within the same statistical framework, and delivers expressive models that capture many complex evolutionary phenomena. TRaIT improves accuracy, robustness to data-specific errors and computational complexity compared to competing methods. Conclusions. We show that the application of TRaIT to single-cell and multi-region cancer datasets can produce accurate and reliable models of single-tumour evolution, quantify the extent of intra-tumour heterogeneity and generate new testable experimental hypotheses.

arXiv.org e-Print Archive

FigShare

Additional file 2: Figure S1. of Potential and active functions in the gut microbiota of a healthy human cohort

Author: Alessandro Tanca (494538)
Antonio Palomba (494539)
Cristina Fraumene (374294)
Edoardo Fiorillo (518797)
Francesco Cucca (145742)
Marcello Abbondio (3706183)
Sergio Uzzau (186221)
Valeria Manghina (3498188)

Principal component analysis plots related to taxonomic and functional features. MG data are in blue, while MP data are in red. Each dot (with different shape) represents a different human subject. (A) phyla; (B) genera; (C) KOGs; (D) KOG-phylum combinations. (PNG 2001 kb)

FigShare

Additional file 5: Dataset S2. of Potential and active functions in the gut microbiota of a healthy human cohort

Author: Alessandro Tanca (494538)
Antonio Palomba (494539)
Cristina Fraumene (374294)
Edoardo Fiorillo (518797)
Francesco Cucca (145742)
Marcello Abbondio (3706183)
Sergio Uzzau (186221)
Valeria Manghina (3498188)

Relative abundance and differential analysis outputs concerning Firmicutes and Bacteroidetes KOGs, according to MG and MP data. (XLSX 101 kb)

FigShare

Percentage of missing non-reference genotypes (i.e. false negatives) per individual in families for variants called by joint modeling family data and the standard approach of ignoring relatedness for sequencing coverage between 5× and 30× and for input sequence data with Phred-scaled quality of 20 (error rate of 1% per base) or 30 (error rate of 0.1% per base) without mapping error.

Author: Bingshan Li (174021)
Carlo Sidore (145691)
Fabio Busonero (186172)
Francesco Cucca (145742)
Gonçalo R. Abecasis (36598)
Hyun M. Kang (108067)
Serena Sanna (145737)
Wei Chen (23863)
Xiaowei Zhan (304150)

For all scenarios, 300 sequenced individuals were simulated.

FigShare

Mismatch rates (%) of 4 categories of genotypes by the reference allele frequencies for pedigrees of quartet (two siblings and their parents) with base quality Q20 at 15× without mapping error.

Author: Bingshan Li (174021)
Carlo Sidore (145691)
Fabio Busonero (186172)
Francesco Cucca (145742)
Gonçalo R. Abecasis (36598)
Hyun M. Kang (108067)
Serena Sanna (145737)
Wei Chen (23863)
Xiaowei Zhan (304150)

The 4 categories are (A) overall genotypes, (B) homozygous alternative allele, (C) heterozygotes and (D) homozygous reference allele.

FigShare

The receiver operating characteristic (ROC) curves of PolyMutt and the standard methods for de novo mutation (DNM) detection from empirically calibrated alignments of simulated reads with sequencing coverage of 30× with base quality of Q20.

Author: Bingshan Li (174021)
Carlo Sidore (145691)
Fabio Busonero (186172)
Francesco Cucca (145742)
Gonçalo R. Abecasis (36598)
Hyun M. Kang (108067)
Serena Sanna (145737)
Wei Chen (23863)
Xiaowei Zhan (304150)

The PolyMutt (ignoring relatedness) and GATK calls were obtained by jointly calling a trio with PolyMutt and GATK, respectively, under the assumption that the individuals in the trio are unrelated.

FigShare

Number of false positive de novo mutations per billion bases detected by PolyMutt of jointly modeling for sequencing at coverage 5×–40× with Phred-scaled base quality Q20 (1% error rate) without mapping error in different pedigrees structures.

Author: Bingshan Li (174021)
Carlo Sidore (145691)
Fabio Busonero (186172)
Francesco Cucca (145742)
Gonçalo R. Abecasis (36598)
Hyun M. Kang (108067)
Serena Sanna (145737)
Wei Chen (23863)
Xiaowei Zhan (304150)

FigShare

Heterozygous mismatch rates (%) and Mendelian inconsistency rates (%) per site of call sets generated by PolyMutt (family-aware) and the standard approaches using PolyMutt (ignoring relatedness) and GATK from empirically calibrated alignments of simulated reads with base quality of Q20 in the pedigree shown in Figure 1.

Author: Bingshan Li (174021)
Carlo Sidore (145691)
Fabio Busonero (186172)
Francesco Cucca (145742)
Gonçalo R. Abecasis (36598)
Hyun M. Kang (108067)
Serena Sanna (145737)
Wei Chen (23863)
Xiaowei Zhan (304150)

Heterozygous mismatch rates (%) and Mendelian inconsistency rates (%) per site of call sets generated by PolyMutt (family-aware) and the standard approaches using PolyMutt (ignoring relatedness) and GATK from empirically calibrated alignments of simulated reads with base quality of Q20 in the pedigree shown in Figure 1 (http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002944#pgen-1002944-g001).

FigShare

Genotype mismatch rates (%) for different family structures with sequencing coverage of 5×, 15×, and 30× and input bases with Phred-scaled quality Q20 (1% error rate) or Q30 (0.1% error rate) without mapping error.

Author: Bingshan Li (174021)
Carlo Sidore (145691)
Fabio Busonero (186172)
Francesco Cucca (145742)
Gonçalo R. Abecasis (36598)
Hyun M. Kang (108067)
Serena Sanna (145737)
Wei Chen (23863)
Xiaowei Zhan (304150)

The mismatch rates are shown for 4 genotype categories: all genotypes (All), homozygous reference allele (HomRef), heterozygotes (Het), and homozygous alternative allele (HomAlt).
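
The Q20 (1% error rate) and Q30 (0.1% error rate) figures quoted in this entry follow from the standard Phred scale, Q = -10 log10(p). A minimal sketch of the conversion is given here for illustration only; the function names are hypothetical and are not taken from PolyMutt, GATK, or the listed record.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability (0 < p <= 1) to a Phred-scaled score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Illustrative values: the two base qualities mentioned in the entry.
    for q in (20, 30):
        print(f"Q{q}: error probability = {phred_to_error_prob(q):.4%}")
```

Each 10-point increase in Q corresponds to a tenfold drop in the assumed per-base error rate, which is why Q30 input is ten times less error-prone than Q20 input.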

FigShare

Three-generation extended pedigrees.

Author: Bingshan Li (174021)
Carlo Sidore (145691)
Fabio Busonero (186172)
Francesco Cucca (145742)
Gonçalo R. Abecasis (36598)
Hyun M. Kang (108067)
Serena Sanna (145737)
Wei Chen (23863)
Xiaowei Zhan (304150)

Panel A) shows a 3-generation extended pedigree with numbers labeling the individual heterozygous genotype mismatch rates (%) at coverage of 15× with base quality of Q20 without mapping error, and Panel B) labels the corresponding mismatch rates for the standard approach of ignoring relatedness. Panels C) and D) display the heterozygous mismatch rates (%) when a fixed sequencing effort of 150× is allocated differently to family members: in Panel C) the founders are allocated 30× while non-founders have 5×, and in Panel D) founders and non-founders have coverage of 6× and 21×, respectively.

FigShare
